Hybrid text simplification using synchronous dependency grammars with hand-written and automatically harvested rules

نویسندگان

  • Advaith Siddharthan
  • Angrosh Mandya
چکیده

We present an approach to text simplification based on synchronous dependency grammars. The higher level of abstraction afforded by dependency representations allows for a linguistically sound treatment of complex constructs requiring reordering and morphological change, such as conversion of passive voice to active. We present a synchronous grammar formalism in which it is easy to write rules by hand and also acquire them automatically from dependency parses of aligned English and Simple English sentences. The grammar formalism is optimised for monolingual translation in that it reuses ordering information from the source sentence where appropriate. We demonstrate the superiority of our approach over a leading contemporary system based on quasi-synchronous tree substitution grammars, both in terms of expressivity and performance.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Text simplification using synchronous dependency grammars: Generalising automatically harvested rules

We present an approach to text simplification based on synchronous dependency grammars. Our main contributions in this work are (a) a study of how automatically derived lexical simplification rules can be generalised to enable their application in new contexts without introducing errors, and (b) an evaluation of our hybrid system that combines a large set of automatically acquired rules with a ...

متن کامل

Automatic Learning of Parallel Dependency Treelet Pairs

Induction of synchronous grammars from empirical data has long been a problem unsolved; despite that generative synchronous grammars theoretically suit the machine translation task very well. This fact is mainly due to pervasive structural divergences between languages. This paper presents a statistical approach to learn dependency structure mappings from parallel corpora. The algorithm introdu...

متن کامل

Hybrid Grammars for Discontinuous Parsing

We introduce the concept of hybrid grammars, which are extensions of synchronous grammars, obtained by coupling of lexical elements. One part of a hybrid grammar generates linear structures, another generates hierarchical structures, and together they generate discontinuous structures. This formalizes and generalizes some existing mechanisms for dealing with discontinuous phrase structures and ...

متن کامل

Quasi-Synchronous Phrase Dependency Grammars for Machine Translation

We present a quasi-synchronous dependency grammar (Smith and Eisner, 2006) for machine translation in which the leaves of the tree are phrases rather than words as in previous work (Gimpel and Smith, 2009). This formulation allows us to combine structural components of phrase-based and syntax-based MT in a single model. We describe a method of extracting phrase dependencies from parallel text u...

متن کامل

Automatic induction of rules for text simplification

Long and complicated sentences pose various problems to many state-of-the-art natural language technologies. We have been exploring methods to automatically transform such sentences as to make them simpler. These methods involve the use of a rule-based system, driven by the syntax of the text in the domain of interest. Hand-crafting rules for every domain is time-consuming and impractical. This...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2014